Semi-Supervised Fuzzy-Rough Feature Selection
Abstract
With the continued and relentless growth in dataset sizes in recent times, feature or attribute selection has become a necessary step in tackling the resultant intractability. Indeed, as the number of dimensions increases, the number of corresponding data instances required in order to generate accurate models increases exponentially. Fuzzy-rough set-based feature selection techniques offer great flexibility when dealing with real-valued and noisy data; however, most of the current approaches focus on the supervised domain where the data object labels are known. Very little work has been carried out using fuzzy-rough sets in the areas of unsupervised or semi-supervised learning. This paper proposes a novel approach for semi-supervised fuzzy-rough feature selection where the object labels in the data may only be partially present. The approach also has the appealing property that any generated subsets are also valid (super)reducts when the whole dataset is labelled. The experimental evaluation demonstrates that the proposed approach can generate stable and valid subsets even when up to 90% of the data object labels are missing.
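The abstract does not give the underlying measure, so the following is only a minimal sketch of how a fuzzy-rough dependency degree could be evaluated when some labels are missing. Everything in it is an assumption made for illustration: the function names `similarity` and `dependency`, the range-normalised similarity, the Łukasiewicz implicator, and above all the conservative rule that an unlabelled object is treated as differing in class from every other object. Under that rule, a subset that reaches dependency 1 on partially labelled data still reaches 1 once any labels are filled in, which is consistent with the (super)reduct property described above, but it is not claimed to be the paper's actual formulation.

```python
import numpy as np


def similarity(X, features):
    """Fuzzy similarity over a feature subset (assumed form):
    the minimum over features of 1 - |a(x) - a(y)| / range(a)."""
    n = X.shape[0]
    R = np.ones((n, n))
    for a in features:
        col = X[:, a]
        rng = col.max() - col.min()
        if rng == 0:
            continue  # a constant feature cannot discern any pair
        R = np.minimum(R, 1.0 - np.abs(col[:, None] - col[None, :]) / rng)
    return R


def dependency(X, labels, features):
    """Dependency degree of the decision on `features`.
    labels[i] is None when the label of object i is missing."""
    n = X.shape[0]
    R = similarity(X, features)
    # Assumption: two distinct objects share a class only if both
    # labels are known and equal; unlabelled objects match nothing.
    same = np.eye(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            if labels[i] is not None and labels[i] == labels[j]:
                same[i, j] = True
    # Lukasiewicz implicator I(a, b) = min(1, 1 - a + b) with crisp b:
    # 1 for same-class pairs, 1 - R otherwise.
    impl = np.where(same, 1.0, 1.0 - R)
    # Positive-region membership per object, averaged over all objects.
    return impl.min(axis=1).mean()


X = np.array([[0.0, 0.0], [0.1, 1.0], [1.0, 0.0], [0.9, 1.0]])
labels = ["a", None, "b", None]
print(dependency(X, labels, [0]))     # first feature only
print(dependency(X, labels, [0, 1]))  # both features
```

In this toy data the second feature separates objects that the first leaves nearly indistinguishable, so adding it raises the dependency degree; a greedy search would keep adding features until the degree stops improving.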
Similar resources
A hybrid filter-based feature selection method via hesitant fuzzy and rough sets concepts
High-dimensional microarray datasets are difficult to classify since they have many features with a small number of instances and an imbalanced distribution of classes. This paper proposes a filter-based feature selection method to improve the classification performance of microarray datasets by selecting the significant features. Combining the concepts of rough sets, weighted rough set, fuzzy rough se...
Fuzzy-rough Information Gain Ratio Approach to Filter-wrapper Feature Selection
Feature selection for various applications has been carried out for many years in many different research areas. However, there is a trade-off between finding feature subsets with minimum length and increasing the classification accuracy. In this paper, a filter-wrapper feature selection approach based on fuzzy-rough gain ratio is proposed to tackle this problem. As a search strategy, a modifie...
On the use of evolutionary feature selection for improving fuzzy rough set based prototype selection
The k-nearest neighbors classifier is a widely used classification method that has proven to be very effective in supervised learning tasks. In this paper, a fuzzy rough set method for prototype selection, focused on optimizing the behavior of this classifier, is presented. The hybridization with an evolutionary feature selection method is considered to further improve its performance, obtainin...
Applications of Fuzzy Rough Set Theory in Machine Learning: a Survey
Data used in machine learning applications is prone to contain both vague and incomplete information. Many authors have proposed using fuzzy rough set theory to develop new techniques that tackle these characteristics. Fuzzy sets deal with vague data, while rough sets allow incomplete information to be modelled. As such, the hybrid setting of the two paradigms is an ideal candidate tool to c...
Application of Fuzzy-rough Set Theory for Feature Subset Selection
Fuzzy Set Theory and Rough Set Theory are the most popular mathematical tools for dealing with uncertainties. Over the past decades, these set theories have been applied successfully in several areas to solve many complex tasks. This paper is concerned with the application of a hybrid fuzzy-rough set-based approach to feature subset selection. Keywords— Fuzzy set theory, Rough Set theory, Fuzzy...